Add serialize deserialize methods to XorBinaryFuse16#47
…thods' into sis/add-serialize-deserialize-methods
lemire
left a comment
This PR makes sense, but you are making unrelated changes, such as making a class final and commenting on the choice of the random generator.
Fixing our typos is fine, but please minimize changes.
This is nice and invited.
hi @lemire thank you for your quick response!
done. please take a look.
do you want me to extend the serialization/deserialization to all filters? I can do that. I just wasn't sure whether you have concerns about the approach taken for the serialization (there are other options and preferences, from vanilla Java serialization, which will hopefully be deprecated some day, to serialization through Input/OutputStream and others). I chose to work with ByteBuffers because that lets me read from direct buffers without first copying the data to the heap.
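To illustrate the ByteBuffer argument, here is a minimal sketch of a codec that works the same way on heap and direct buffers. `FingerprintCodec` and its layout (a 4-byte length prefix followed by 16-bit fingerprints) are hypothetical and are not the format used by this PR; the only copy made on the read side is into the final `short[]`, with no intermediate `byte[]`.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of fastfilter.
final class FingerprintCodec {
    // Size in bytes: 4-byte length prefix plus 2 bytes per fingerprint.
    static int serializedSize(short[] fingerprints) {
        return Integer.BYTES + fingerprints.length * Short.BYTES;
    }

    // Writes the length prefix and the fingerprints at the buffer's current position.
    static void write(short[] fingerprints, ByteBuffer buffer) {
        buffer.putInt(fingerprints.length);
        for (short f : fingerprints) {
            buffer.putShort(f);
        }
    }

    // Reads the fingerprints back; works for heap and direct buffers alike.
    static short[] read(ByteBuffer buffer) {
        int n = buffer.getInt();
        short[] out = new short[n];
        // Bulk transfer through a short view; the view has its own position,
        // so the original buffer is advanced explicitly afterwards.
        buffer.asShortBuffer().get(out);
        buffer.position(buffer.position() + n * Short.BYTES);
        return out;
    }
}
```

The same `read` call could be pointed at a memory-mapped or network-filled direct buffer without any change to the codec.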
Looks good !!! Could you extend it to Xor8, Xor16, XorBinaryFuse8, XorBinaryFuse16, XorBinaryFuse32?
hi @lemire I just finished the ask. please take a look.
lemire
left a comment
@thomasmueller : I think we should consider merging this PR.
I happen to need exactly the same thing, so it is great to see it added. One question though: maybe an overload taking an InputStream would be also, or more, useful? Say I am reading a serialized filter from the network or S3: then I have an InputStream. I can read it into a ByteBuffer and then call deserialize, but that's an unnecessary copy.
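For reference, the workaround described above might look like the following sketch. `StreamToBuffer` is a hypothetical helper, not part of the library; it drains the stream into a heap array (the extra copy in question) and wraps it so that a ByteBuffer-based deserialize method can consume it. `InputStream.readAllBytes()` requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of fastfilter.
final class StreamToBuffer {
    // Drains the stream into a byte[] and wraps it without a second copy.
    // The byte[] itself is the intermediate copy an InputStream overload would avoid.
    static ByteBuffer readAll(InputStream in) {
        try {
            byte[] bytes = in.readAllBytes(); // Java 9+
            return ByteBuffer.wrap(bytes);    // position 0, limit bytes.length
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned buffer is ready for reading, so it can be handed directly to whatever deserialize method accepts a ByteBuffer.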
@vprus I would encourage you to produce a PR.
hi @lemire
nice to e-meet you. First of all, I would like to say thank you for such great algorithms and implementation. This is literally a treasure for my current work!
I want to transfer the XorBinaryFuse16 over the wire. Therefore I want to wrap the serialized filter data in an envelope to support different underlying filters / algorithms (e.g. use RoaringBitmap serialized data if, in some edge cases, I really need exact filtering). So, to avoid additional memory copying, I thought I would build the logic around the given ByteBuffer. The idea is the following:
- call org.fastfilter.Filter#getSerializedSize and allocate a ByteBuffer of sufficient size, then write the header
- call org.fastfilter.Filter#serialize(ByteBuffer buffer) to write the serialized filter
- call org.fastfilter.xor.XorBinaryFuse16#deserialize(ByteBuffer buffer) to create the filter.

If this looks good to you, I can extend the same approach to the other filters implemented in the library.
Looking forward to hearing your feedback!
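The envelope idea can be sketched as follows. Everything here is an assumption made for illustration: `SerializableFilter` is a stand-in interface that mirrors the method names mentioned above (the real fastfilter signatures may differ), `ToyFilter` is a dummy payload rather than a real filter, and the one-byte type tag is a hypothetical header format.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the methods named in the description.
interface SerializableFilter {
    int getSerializedSize();
    void serialize(ByteBuffer buffer);
}

// Dummy payload type used only to make the sketch runnable.
final class ToyFilter implements SerializableFilter {
    final long[] words;

    ToyFilter(long[] words) { this.words = words; }

    public int getSerializedSize() {
        return Integer.BYTES + words.length * Long.BYTES;
    }

    public void serialize(ByteBuffer buffer) {
        buffer.putInt(words.length);
        for (long w : words) {
            buffer.putLong(w);
        }
    }

    static ToyFilter deserialize(ByteBuffer buffer) {
        long[] words = new long[buffer.getInt()];
        for (int i = 0; i < words.length; i++) {
            words[i] = buffer.getLong();
        }
        return new ToyFilter(words);
    }
}

final class Envelope {
    static final byte TYPE_TOY = 1; // hypothetical type tag

    // Allocates header + payload once, so the filter writes straight into the envelope.
    static ByteBuffer wrap(byte type, SerializableFilter filter) {
        ByteBuffer buffer = ByteBuffer.allocate(1 + filter.getSerializedSize());
        buffer.put(type);         // header
        filter.serialize(buffer); // payload, no intermediate copy
        buffer.flip();
        return buffer;
    }

    // Reads the header and dispatches to the matching deserializer.
    static ToyFilter unwrap(ByteBuffer buffer) {
        byte type = buffer.get();
        if (type != TYPE_TOY) {
            throw new IllegalArgumentException("unknown filter type: " + type);
        }
        return ToyFilter.deserialize(buffer);
    }
}
```

Because the size is known before allocation, the header and the filter bytes share one buffer, which is the property the description is after.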